 transfer gap




TacCap: A Wearable FBG-Based Tactile Sensor for Seamless Human-to-Robot Skill Transfer

Xing, Chengyi, Li, Hao, Wei, Yi-Lin, Ren, Tian-Ao, Tu, Tianyu, Lin, Yuhao, Schumann, Elizabeth, Zheng, Wei-Shi, Cutkosky, Mark R.

arXiv.org Artificial Intelligence

Tactile sensing is essential for dexterous manipulation, yet large-scale human demonstration datasets lack tactile feedback, limiting their effectiveness in skill transfer to robots. To address this, we introduce TacCap, a wearable Fiber Bragg Grating (FBG)-based tactile sensor designed for seamless human-to-robot transfer. TacCap is lightweight, durable, and immune to electromagnetic interference, making it ideal for real-world data collection. We detail its design and fabrication, evaluate its sensitivity, repeatability, and cross-sensor consistency, and assess its effectiveness through grasp stability prediction and ablation studies. Our results demonstrate that TacCap enables transferable tactile data collection, bridging the gap between human demonstrations and robotic execution. To support further research and development, we open-source our hardware design and software.
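The grasp stability prediction mentioned above can be pictured with a toy sketch. Everything below (the function name, the picometre threshold, the min-shift heuristic) is invented for illustration; the paper's actual predictor is learned from TacCap data, not a fixed rule.

```python
def grasp_is_stable(wavelength_shifts_pm, contact_threshold_pm=50.0):
    """Toy stability check: each FBG in the sensor reports a wavelength
    shift (picometres) that grows with contact strain. Here we call the
    grasp stable only if the weakest contact still registers enough
    strain. Threshold and units are illustrative placeholders."""
    return min(wavelength_shifts_pm) >= contact_threshold_pm
```

A real pipeline would feed the per-grating shift signals into a learned classifier rather than a hand-set threshold.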


An Empirical Study of Scaling Laws for Transfer

Barnett, Matthew

arXiv.org Artificial Intelligence

In recent years, a number of papers have uncovered machine learning scaling laws: empirical regularities that describe how the performance of a model improves as a function of scale, usually measured in parameter count and data (Hestness et al. 2017, Kaplan et al. 2020, Hoffmann et al. 2022). Hernandez et al. 2021 described scaling laws for transfer learning, showing how the transfer properties of models change as a function of model size. Their primary result was that the degree of transfer, measured as the amount of effective data transferred from one distribution to another, follows a simple power law in parameter count and fine-tuning data size. However, their analysis left much room for further exploration: it considered only transfer from English to Python, and did not explore the relationship between pre-training data size and the degree of downstream transfer. Scaling laws for transfer are important to study because they indicate the degree to which progress in machine learning is bottlenecked by data for specific tasks. To achieve high performance on some tasks, a standard approach in the foundation model paradigm is to pre-train a model on a large, diverse distribution and then fine-tune it on a particular downstream task (Bommasani et al. 2022).
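The power-law relationship described above can be written down directly. The functional form below follows the spirit of Hernandez et al. 2021 (effective data transferred as a power law in parameter count and fine-tuning data size); the coefficients are placeholders, not fitted values from any paper.

```python
def effective_data_transferred(n_params, finetune_tokens,
                               k=1.0, alpha=0.3, beta=0.4):
    """Illustrative power law: effective data D_T = k * N**alpha * D_F**beta,
    where N is the parameter count and D_F the fine-tuning data size.
    k, alpha, beta would be fitted empirically; the values here are
    placeholders for demonstration only."""
    return k * (n_params ** alpha) * (finetune_tokens ** beta)
```

Under this form, both a larger model and more fine-tuning data increase the effective data transferred, which is the qualitative behavior the scaling-law literature reports.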


Analyzing and reducing the synthetic-to-real transfer gap in Music Information Retrieval: the task of automatic drum transcription

Zehren, Mickaël, Alunno, Marco, Bientinesi, Paolo

arXiv.org Artificial Intelligence

Automatic drum transcription is a critical tool in Music Information Retrieval for extracting and analyzing the rhythm of a music track, but it is limited by the size of the datasets available for training. A popular way to increase the amount of data is to generate it synthetically from music scores rendered with virtual instruments. This method can produce a virtually infinite quantity of tracks, but empirical evidence shows that models trained on previously created synthetic datasets do not transfer well to real tracks. In this work, besides increasing the amount of data, we identify and evaluate three more strategies that practitioners can use to improve the realism of the generated data and thus narrow the synthetic-to-real transfer gap. To explore their efficacy, we used them to build a new synthetic dataset and then measured how the performance of a model scales, and specifically at what value it stagnates, as the number of training tracks increases for different datasets. In doing so, we show that the aforementioned strategies make our dataset the one with the most realistic data distribution and the lowest synthetic-to-real transfer gap among the synthetic datasets we evaluated. We conclude by highlighting the limits of training with infinite data in drum transcription and by showing how they can be overcome.
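The "at what value performance stagnates" measurement above is typically captured by fitting a saturating curve to performance versus training-set size. The curve family and parameter values below are illustrative assumptions, not the paper's fitted model.

```python
def synthetic_scaling_curve(n_tracks, ceiling, scale, decay):
    """Saturating power law: performance approaches `ceiling` (the
    stagnation value for a given synthetic dataset) as the number of
    training tracks grows. A more realistic synthetic dataset would
    show a higher ceiling. All parameters here are illustrative."""
    return ceiling - scale * n_tracks ** (-decay)
```

Comparing the fitted ceilings of two datasets gives a concrete number for which one has the smaller synthetic-to-real transfer gap.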


BERT Goes Off-Topic: Investigating the Domain Transfer Challenge using Genre Classification

Roussinov, Dmitri, Sharoff, Serge

arXiv.org Artificial Intelligence

While the performance of many text classification tasks has recently improved thanks to Pre-trained Language Models (PLMs), in this paper we show that they still suffer from a performance gap when the underlying distribution of topics changes. For example, a genre classifier trained on political topics often fails when tested on documents about sport or medicine. In this work, we quantify this phenomenon empirically with a large corpus and a large set of topics. We then verify that domain transfer remains challenging both for classic PLMs, such as BERT, and for modern large models, such as GPT-3. We also suggest and successfully test a possible remedy: after augmenting the training dataset with topically-controlled synthetic texts, the F1 score improves by up to 50% for some topics, nearing on-topic training results, while other topics show little to no improvement. While our empirical results focus on genre classification, our methodology is applicable to other classification tasks such as gender, authorship, or sentiment classification. The code and data to replicate the experiments are available at https://github.com/dminus1/genre
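The gap the paper quantifies is simply the F1 lost when test topics differ from training topics. A minimal sketch, with standard F1 computed from scratch (the helper names are ours, not the paper's):

```python
def binary_f1(preds, golds):
    """Standard binary F1 from predictions and gold labels (1 = positive)."""
    tp = sum(1 for p, g in zip(preds, golds) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(preds, golds) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(preds, golds) if p == 0 and g == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def topic_transfer_drop(f1_on_topic, f1_off_topic):
    # The quantity of interest: how much F1 is lost when the test set's
    # topics differ from the training topics.
    return f1_on_topic - f1_off_topic
```

Augmentation with topically-controlled synthetic texts succeeds exactly when it shrinks this drop toward zero.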


Boosting Cross-lingual Transferability in Multilingual Models via In-Context Learning

Kim, Sunkyoung, Ki, Dayeon, Kim, Yireun, Lee, Jinsik

arXiv.org Artificial Intelligence

Existing cross-lingual transfer (CLT) prompting methods are concerned only with monolingual demonstration examples in the source language. In this paper, we propose In-CLT, a novel cross-lingual transfer prompting method that leverages both the source and target languages to construct the demonstration examples. We conduct comprehensive evaluations on multilingual benchmarks, focusing on question answering tasks. Experimental results show that In-CLT prompting not only improves multilingual models' cross-lingual transferability, but also demonstrates remarkable generalization to unseen languages. In particular, In-CLT prompting improves model performance by 10 to 20 percentage points on average compared to prior cross-lingual transfer approaches. We also observe surprising performance gains on other multilingual benchmarks, especially in reasoning tasks. Furthermore, we investigate the relationship between lexical similarity and pre-training corpora in terms of the cross-lingual transfer gap.
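The core idea, demonstrations that mix the source and target languages inside one prompt, can be sketched as a template builder. This is one plausible reading of the method; the exact template, language pair, and answer format are illustrative assumptions, not the paper's specification.

```python
def build_cross_lingual_prompt(demos, query,
                               src_lang="English", tgt_lang="German"):
    """Sketch of an In-CLT-style prompt: each demonstration pairs a
    source-language question with a target-language answer, so the
    prompt itself mixes both languages. Template is illustrative."""
    lines = []
    for question, answer in demos:
        lines.append(f"Q ({src_lang}): {question}")
        lines.append(f"A ({tgt_lang}): {answer}")
    lines.append(f"Q ({src_lang}): {query}")
    lines.append(f"A ({tgt_lang}):")
    return "\n".join(lines)
```

Contrast this with conventional CLT prompting, where both the demonstration questions and answers stay in the source language.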


Revisiting Machine Translation for Cross-lingual Classification

Artetxe, Mikel, Goswami, Vedanuj, Bhosale, Shruti, Fan, Angela, Zettlemoyer, Luke

arXiv.org Artificial Intelligence

Machine Translation (MT) has been widely used for cross-lingual classification, either by translating the test set into English and running inference with a monolingual model (translate-test), or by translating the training set into the target languages and finetuning a multilingual model (translate-train). However, most research in the area focuses on the multilingual models rather than the MT component. We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine-translated text, translate-test can do substantially better than previously assumed. The optimal approach, however, is highly task-dependent, as we identify various sources of the cross-lingual transfer gap that affect different tasks and approaches differently. Our work calls into question the dominance of multilingual models for cross-lingual classification, and calls for more attention to MT-based baselines.
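The translate-test recipe described above is a simple two-stage pipeline, sketched here with caller-supplied stand-ins for the MT system and the classifier (both functions below are hypothetical placeholders, not a real API):

```python
def translate_test(inputs, translate, classify):
    """Translate-test: map each target-language input into English with
    an MT system, then classify with a monolingual English model.
    `translate` and `classify` are stand-ins for real components."""
    return [classify(translate(text)) for text in inputs]
```

Translate-train is the mirror image: the training set is machine-translated into the target languages before fine-tuning, and inference runs directly on original target-language text.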


Measuring Cross-Lingual Transferability of Multilingual Transformers on Sentence Classification

Chi, Zewen, Huang, Heyan, Mao, Xian-Ling

arXiv.org Artificial Intelligence

Recent studies have exhibited remarkable capabilities of pre-trained multilingual Transformers, especially cross-lingual transferability. However, current methods do not measure cross-lingual transferability well, hindering the understanding of multilingual Transformers. In this paper, we propose IGap, a cross-lingual transferability metric for multilingual Transformers on sentence classification tasks. IGap takes training error into consideration, and can also estimate transferability without end-task data. Experimental results show that IGap outperforms baseline metrics for measuring transferability and ranking transfer directions. In addition, we conduct extensive systematic experiments comparing transferability across various multilingual Transformers, fine-tuning algorithms, and transfer directions. More importantly, our results reveal three findings about cross-lingual transfer, which help us better understand multilingual Transformers.
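To make "transferability metric" and "transfer direction ranking" concrete, here is a generic gap-style baseline of the kind IGap is compared against. This is emphatically not the paper's IGap formula, just the simplest metric in the same family: target loss relative to the loss reached on the source training data.

```python
def transfer_gap_score(source_train_loss, target_eval_loss):
    """Generic baseline (not IGap): transferability looks worse as the
    target-language loss grows relative to the source training loss."""
    return target_eval_loss - source_train_loss

def rank_transfer_directions(direction_losses):
    """Rank (name, source_train_loss, target_eval_loss) triples from
    the smallest gap (best transfer) to the largest, mirroring the
    transfer-direction ranking experiment."""
    return sorted(direction_losses,
                  key=lambda d: transfer_gap_score(d[1], d[2]))
```

IGap's contribution is doing this more faithfully, in particular by accounting for training error and by working even without end-task data.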


Characterizing and Avoiding Negative Transfer

Wang, Zirui, Dai, Zihang, Póczos, Barnabás, Carbonell, Jaime

arXiv.org Machine Learning

When labeled data is scarce for a specific target task, transfer learning often offers an effective solution by utilizing data from a related source task. However, when transferring knowledge from a less related source, it may instead hurt the target performance, a phenomenon known as negative transfer. Despite its pervasiveness, negative transfer is usually described informally, lacking a rigorous definition, careful analysis, or systematic treatment. This paper proposes a formal definition of negative transfer and analyzes three important aspects of it. Stemming from this analysis, a novel technique is proposed to circumvent negative transfer by filtering out unrelated source data. Based on adversarial networks, the technique is highly generic and can be applied to a wide range of transfer learning algorithms. The proposed approach is evaluated on six state-of-the-art deep transfer methods via experiments on four benchmark datasets with varying levels of difficulty. Empirically, the proposed method consistently improves the performance of all baseline methods and largely avoids negative transfer, even when the source data is degenerate.
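The filtering idea above can be sketched in its simplest form: score each source example by how related it looks to the target distribution and drop the rest. In the paper the scorer comes from an adversarial discriminator; the stand-in scorer and threshold below are illustrative only.

```python
def filter_source_data(source_examples, relatedness, threshold=0.5):
    """Sketch of the filtering step: keep only source examples that a
    relatedness scorer (in the paper, derived from an adversarial
    discriminator) judges close to the target distribution. The
    `relatedness` function and `threshold` are stand-ins."""
    return [x for x in source_examples if relatedness(x) >= threshold]
```

Training then proceeds on the filtered source set, so examples likely to cause negative transfer never reach the transfer learning algorithm.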